Abstract. Because decomposability is a very practical property in welfare analysis, most researchers and users favor decomposable poverty indices such as the Foster-Greer-Thorbecke poverty index. This may lead to neglecting important weighted indices, like those of Kakwani and Shorrocks, which have other interesting properties in welfare analysis. To address this problem, we give in this paper statistical estimates of the gap of decomposability for a large class of such indices, using the General Poverty Index (GPI) and a new asymptotic representation theorem for it in terms of functional empirical process theory. The results then enable independent handling of targeted groups, followed by global reporting with confidence intervals. Data-driven examples are given with real data.



1. Introduction 

We are concerned in this paper with the statistical estimation of the gap of decomposability of statistical poverty indices in general. Suppose that we have some statistic of the functional form $J_n = J(Y_1, \ldots, Y_n)$, where $E = \{Y_1, \ldots, Y_n\}$ is a sample of the random variable $Y$ defined on a probability space $(\Omega, \mathcal{A}, \mathbb{P})$ and drawn from some specific population. Now, suppose that this population is divided into $K$ subgroups $S_1, \ldots, S_K$ and let us, for each $i \in \{1, \ldots, K\}$, denote the subset of the random sample $\{Y_1, \ldots, Y_n\}$ coming from $S_i$ by $E_i = \{Y_{i,1}, \ldots, Y_{i,n_i}\}$, and then put $J_{n_i}(i) = J(Y_{i,1}, \ldots, Y_{i,n_i})$. The statistic $J_n$ is said to be decomposable whenever one always has

$$J_n = \frac{1}{n} \sum_{i=1}^{K} n_i J_{n_i}(i), \tag{1.1}$$

whatever the way in which $E$ is partitioned into the $E_i$'s ($i = 1, \ldots, K$). This property is a very practical one when dealing with poverty measures, or welfare measures in general, for the following reason. If we are willing to monitor the poverty situation, it may be very useful to target some sensitive areas or subgroups. By dividing the population into targeted groups, and estimating the poverty intensity by $J_{n_i}(i)$ (resp. the variation of poverty by $\Delta J_{n_i}(i)$) in each group, one would be able to report the global poverty intensity by (1.1) (resp. the global poverty variation by $\Delta J_n = \frac{1}{n} \sum_{i=1}^{K} n_i \Delta J_{n_i}(i)$), provided that the samples are the same, as is the case in longitudinal data. Thus, decomposability allows an independent handling of poverty for different areas, followed by an easy reconstruction of the global situation.

Now, in the specific case of poverty indices, we mainly have the non-weighted ones and the weighted ones. The statistics in the first case are automatically decomposable and are therefore mostly preferred by users. However, the weighted measures, which in general are not decomposable, have very interesting properties in poverty analysis. Dismissing them only for non-decomposability would be a serious loss. We tackle this problem in this paper. Indeed, by estimating the following gap of decomposability

$$gd_n = J_n - \frac{1}{n} \sum_{i=1}^{K} n_i J_{n_i}(i)$$

with confidence intervals, we would be able to carry out separate analyses in the subgroups and report the global case while, at the same time, benefiting from the other properties of such statistics.

The remainder of the paper is organized as follows. In Section 2, we give a brief introduction to poverty measures and to the General Poverty Index (GPI). In Section 3, we return to the decomposability problem by describing the drawing scheme under which the results are given. In Section 4, we state the results, which are applied to the Senegalese and Mauritanian data in Section 5. The proofs are given in Section 6. The concluding remarks are in Section 7. The paper ends with an appendix in Section 8.



2. A brief reminder on poverty measures

We consider a population of individuals or households, each of which has a random income or expenditure $Y$ with distribution function $G(y) = \mathbb{P}(Y \le y)$. In the sequel, we use $Y$ as an income variable although it might be any positive random variable. An individual is classified as poor whenever his income or expenditure $Y$ fulfills $Y \le Z$, where $Z$ is a specified threshold level (the poverty line).

Consider also a random sample $Y_1, Y_2, \ldots, Y_n$ of size $n$ of incomes, with empirical distribution function $G_n(y) = n^{-1} \#\{Y_i \le y : 1 \le i \le n\}$. The number of poor individuals within the sample is then equal to $Q_n = n G_n(Z)$. From now on, all the random elements used in the paper are defined on the same probability space $(\Omega, \mathcal{A}, \mathbb{P})$.

Given these preliminaries, we introduce measurable functions $A(p, q, z)$, $w(t)$ and $d(t)$ of $p, q \in \mathbb{N}$ and $z, t \in \mathbb{R}$. Set $B(Q_n) = \sum_{i=1}^{Q_n} w(i)$.
Let $Y_{1,n} \le Y_{2,n} \le \cdots \le Y_{n,n}$ be the order statistics of the sample $Y_1, Y_2, \ldots, Y_n$ of $Y$. We consider general poverty indices (GPI) of the form

$$GPI_n = \delta\left( \frac{A(Q_n, n, Z)}{n B(Q_n, n)} \sum_{j=1}^{Q_n} w(\mu_1 n + \mu_2 Q_n - \mu_3 j + \mu_4)\, d\left( \frac{Z - Y_{j,n}}{Z} \right) \right), \tag{2.1}$$

where $\mu_1, \mu_2, \mu_3, \mu_4$ are constants. This global form of poverty indices was introduced in [15] (see also [13], [15] and [16]) as an attempt to unify the large number of poverty indices that have been introduced in the literature since the pioneering work of the Nobel Prize winner Amartya Sen (1976), who first derived poverty measures (see [19]) from an axiomatic point of view. A survey of these indices is to be found in Zheng [24], who also discussed their introduction from an axiomatic point of view. We will cite a few of them here, to fix ideas and to prepare the data-driven applications in Section 5.

One may divide the poverty indices into two classes. The first includes the non-weighted ones. The most popular of them is the Foster-Greer-Thorbecke (1984) [7] class, which is defined for $\alpha \ge 0$ by

$$FGT(\alpha) = \frac{1}{n} \sum_{j=1}^{Q_n} \left( \frac{Z - Y_{j,n}}{Z} \right)^{\alpha}. \tag{2.2}$$

For $\alpha = 0$, (2.2) reduces to $Q_n/n$, the headcount ratio of poor individuals. For $\alpha = 1$ and $\alpha = 2$, it is respectively interpreted as the depth of poverty and the severity of poverty. (2.2) is obtained from (2.1) by taking

$\delta = Id$, $w = 1$, $d(u) = u^{\alpha}$, $B(Q_n, n) = Q_n$ and $A(Q_n, n, Z) = Q_n$.

Next, for $\alpha > 0$, the Chakravarty family of poverty measures is obtained from (2.1) by taking $Y^{\alpha}$ and $Z^{\alpha}$ as, respectively, the transformed income $Y$ and threshold $Z$, and

$\delta = Id$, $w = 1$, $d(u) = u$, $B(Q_n, n) = Q_n$ and $A(Q_n, n, Z) = Q_n$.

The statistics in this class are decomposable and are not concerned by the present work.
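The decomposability of this first class can be checked numerically. The following is a minimal sketch, assuming the usual form of the FGT index (the average over the whole sample of the normalized shortfalls of the poor, raised to the power alpha); the incomes and the poverty line are hypothetical values chosen only for illustration.

```python
def fgt(sample, z, alpha):
    """FGT index: (1/n) * sum over poor y of ((z - y) / z) ** alpha."""
    poor = [y for y in sample if y <= z]
    return sum(((z - y) / z) ** alpha for y in poor) / len(sample)


def weighted_recombination(groups, index):
    """(1/n) * sum_i n_i * index(group_i), the decomposition formula."""
    n = sum(len(g) for g in groups)
    return sum(len(g) * index(g) for g in groups) / n


# Hypothetical incomes for two subgroups and a hypothetical poverty line.
group_a = [1.0, 3.0, 5.0]
group_b = [2.0, 6.0, 7.0]
z = 4.0

for alpha in (0, 1, 2):
    index = lambda s, a=alpha: fgt(s, z, a)
    pooled = index(group_a + group_b)
    recombined = weighted_recombination([group_a, group_b], index)
    print(alpha, pooled, recombined)  # the two values agree for each alpha
```

The agreement holds for any partition because each poor individual contributes the same term to the pooled sum and to its subgroup sum.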

The second class consists of the weighted indices. We mention here two of its famous members. The Sen (1976) index (see [19])

$$P_{SE,n} = \frac{2}{n(Q_n + 1)} \sum_{j=1}^{Q_n} (Q_n - j + 1) \left( \frac{Z - Y_{j,n}}{Z} \right) \tag{2.3}$$

is obtained from (2.1) by taking

$d(u) = u$, $w(u) = u$, $A(Q_n, n, Z) = Q_n$, $B(Q_n) = Q_n(Q_n + 1)/2$, $\mu_1 = 0$ and $\mu_3 = \mu_2 = \mu_4 = 1$.

The Shorrocks (1995) index (see [21])

$$P_{SH,n} = \frac{1}{n^2} \sum_{j=1}^{Q_n} (2n - 2j + 1) \left( \frac{Z - Y_{j,n}}{Z} \right) \tag{2.4}$$

is obtained from (2.1) by taking

$B(Q_n, n) = Q_n(Q_n + 1)/2$, $A(n, Q_n, Z) = Q_n(Q_n + 1)/(2n)$, $\delta = Id$, $w(u) = u$, $d(u) = u$, $\mu_1 = 2$, $\mu_2 = 0$, $\mu_3 = 2$, $\mu_4 = 1$.

Measures (2.3) and (2.4) evaluate the poverty intensity by giving more weight to the poorest individuals. This means that a small decrease in poverty intensity among the poorest households indicates a significant improvement in the population.
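To make the parameter choices concrete, here is a minimal sketch of the GPI in which the Sen and Shorrocks indices are obtained by plugging in the parameters listed above. The reading of the general form (a normalizing factor A/(nB) times a weighted sum over the poor) is an assumption about (2.1), and the function names and sample values are hypothetical.

```python
def gpi(sample, z, A, B, w, d, mu, delta=lambda x: x):
    """General poverty index; A and B are called as A(q, n) and B(q, n)."""
    ys = sorted(sample)
    n = len(ys)
    poor = [y for y in ys if y <= z]
    q = len(poor)
    if q == 0:
        return delta(0.0)  # no poor individual: empty sum
    mu1, mu2, mu3, mu4 = mu
    total = sum(
        w(mu1 * n + mu2 * q - mu3 * j + mu4) * d((z - y) / z)
        for j, y in enumerate(poor, start=1)  # j is the rank among the poor
    )
    return delta(A(q, n) / (n * B(q, n)) * total)


def sen(sample, z):
    """Sen index: weights Q - j + 1, normalization 2 / (n (Q + 1))."""
    return gpi(sample, z,
               A=lambda q, n: q, B=lambda q, n: q * (q + 1) / 2,
               w=lambda u: u, d=lambda u: u, mu=(0, 1, 1, 1))


def shorrocks(sample, z):
    """Shorrocks index: weights 2n - 2j + 1, normalization 1 / n**2."""
    return gpi(sample, z,
               A=lambda q, n: q * (q + 1) / (2 * n),
               B=lambda q, n: q * (q + 1) / 2,
               w=lambda u: u, d=lambda u: u, mu=(2, 0, 2, 1))


incomes = [1.0, 3.0, 5.0]   # hypothetical sample
poverty_line = 4.0          # hypothetical threshold
print(sen(incomes, poverty_line), shorrocks(incomes, poverty_line))
```

In both cases the poorest individual (rank j = 1) receives the largest weight, which is the sensitivity property described above.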

In the applications, we mainly deal with these two specific measures because of their importance in poverty analysis. Notice that the Thon measure ([22]) differs from the Shorrocks one only by its normalization coefficient, these coefficients being respectively $n(n+1)$ and $n^2$, so that the two measures have the same asymptotic behavior. Finally, we have the following generalization of the Sen measure given by Kakwani (1980) [11],

$$J_n(k) = \frac{Q_n}{n\, \Phi_k(Q_n)} \sum_{j=1}^{Q_n} (Q_n - j + 1)^k \left( \frac{Z - Y_{j,n}}{Z} \right), \qquad \Phi_k(Q) = \sum_{j=1}^{Q} j^k,$$

where $k$ is a positive parameter. Notice that $J_n(1)$ is the Sen measure. Notice also that, under mild conditions, $J_n$ converges in probability to the Exact General Poverty Index (EGPI) (see [1], [2], [3] and [13]),



(2.5)

where $L_1$ is some weight function depending on the distribution function. This result will be proved again in Theorem 1 below.



3. Statistical decomposability

From now on, we suppose that our studied population of households is divided into $K$ subgroups such that, for each $i \in \{1, \ldots, K\}$, the probability that a randomly drawn household comes from the $i$-th subgroup is $p_i > 0$, with $p_1 + \cdots + p_K = 1$. Let us suppose that we draw a sample of size $n$ from the population, $Y_1, \ldots, Y_n$, and let us denote the $n_i^*$ observations coming from the $i$-th subgroup ($1 \le i \le K$) by $Y_{i,j}$, $j = 1, \ldots, n_i^*$. Let $J_{n_i^*}(G_i) = J(Y_{i,1}, \ldots, Y_{i,n_i^*})$ be the empirical index measured on the $i$-th subgroup and $J_n(G)$ the global index. Clearly, decomposability implies, for all $n \ge 1$,

$$gd_n = J_n - \frac{1}{n} \sum_{i=1}^{K} n_i^* J_{n_i^*} = 0.$$

Surely, $n^* = (n_1^*, \ldots, n_K^*)$ follows a multinomial law with parameters $n$ and $p = (p_1, \ldots, p_K)$. Since each $p_i > 0$, we have that, for each $1 \le i \le K$, $n_i^* \to \infty$ a.s. as $n \to \infty$. We will have, by (1.1) and by (2.5),

$$gd_n = J_n(G) - \frac{1}{n} \sum_{i=1}^{K} n_i^* J_{n_i^*}(G_i) \to_P gd = J(G) - \sum_{i=1}^{K} p_i J(G_i).$$
The right-hand member of this equation is the exact gap of decomposability $gd$. It follows that $gd$ is zero if the distribution of the income is the same over all the population (since then $J(G_i) = J(G)$ for every $i$ and the $p_i$ sum to one), and that the more homogeneous the income is over the population, the lower the gap of decomposability $gd$ is. As a first result, we get that decomposability does not matter, asymptotically at least, for a more or less homogeneous population. That is, decomposability is not only a matter of the functional form of the index; it is also a statistical one since, whatever the index might be, decomposability is asymptotically obtained when the subgroups have the same distribution. For example, it has been pointed out in [10], for the Senegalese poverty databases from 1996 to 2001, that the gaps of decomposability were very low for various stratifications (by region, gender, ethnic group, etc.). The apparent reason was the homogeneity of the income. Such results are confirmed in Section 5.
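The empirical gap is straightforward to compute once an index is available. Below is a minimal sketch for the Sen index, assuming its usual form (weights Q - j + 1 on the ranked poor and normalization 2/(n(Q + 1))); the two subgroups and the poverty line are hypothetical.

```python
def sen(sample, z):
    """Sen index with weights Q - j + 1 on the poor, ranked increasingly."""
    poor = sorted(y for y in sample if y <= z)
    q, n = len(poor), len(sample)
    total = sum((q - j + 1) * (z - y) / z for j, y in enumerate(poor, start=1))
    return 2.0 / (n * (q + 1)) * total


def gap_of_decomposability(groups, index):
    """Pooled index minus the (n_i / n)-weighted sum of subgroup indices."""
    n = sum(len(g) for g in groups)
    pooled = [y for g in groups for y in g]
    return index(pooled) - sum(len(g) * index(g) for g in groups) / n


# Hypothetical incomes and poverty line.
group_a = [1.0, 3.0, 5.0]
group_b = [2.0, 6.0, 7.0]
gap = gap_of_decomposability([group_a, group_b], lambda s: sen(s, 4.0))
print(gap)  # small but nonzero: the Sen index is not decomposable
```

For a weighted index the ranks within a subgroup differ from the ranks in the pooled sample, which is why the gap does not vanish in general.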

Now we want to find the law of

$$gd_n^* = \sqrt{n}\,(gd_n - gd)$$

for a more accurate estimation of $gd$ by confidence intervals. At this step, we have to specify our random scheme. We consider a probability space $(\Omega_1 \times \Omega_2, \mathcal{A}_1 \otimes \mathcal{A}_2, \mathbb{P}_1 \otimes \mathbb{P}_2)$ and put $\mathbb{P} = \mathbb{P}_1 \otimes \mathbb{P}_2$. We draw the observations in the following way. In each trial, we draw a subgroup, the $i$-th subgroup $(E_i)$ having probability of occurrence $p_i$. And we put

$$\pi_{i,j}(\omega_1) = 1_{(\text{the } i\text{-th subgroup is drawn at the } j\text{-th trial})},$$

$1 \le i \le K$, $1 \le j \le n$. Now, given that the $i$-th subgroup is drawn at the $j$-th trial, we pick one individual in this subgroup and observe its income $Y_j(\omega_1, \omega_2)$. We then have the observations

$$\{Y_j(\omega_1, \omega_2),\ 1 \le j \le n\}.$$

We have these simple facts. First, for $1 \le i \le K$,

$$n_i^* = \sum_{j=1}^{n} \pi_{i,j}. \tag{3.1}$$

Secondly, the distribution of $Y_j$ given $(\pi_{i,j} = 1)$ is $G_i$, that is,

$$\mathbb{P}(Y_j \le y \mid \pi_{i,j} = 1) = G_i(y), \quad \forall\, y \in \mathbb{R}.$$

Next,

$$\mathbb{P}(Y_j \le y) = \sum_{i=1}^{K} \mathbb{P}(\pi_{i,j} = 1)\, \mathbb{P}(Y_j \le y \mid \pi_{i,j} = 1) = \sum_{i=1}^{K} p_i G_i(y).$$

We conclude that $\{Y_1, \ldots, Y_n\}$ is an independent sample drawn from $G(y) = \sum_{i=1}^{K} p_i G_i(y)$, the mixture of the distribution functions of the subgroup incomes. Finally, we readily see that, conditionally on $n^* = (n_1^*, n_2^*, \ldots, n_K^*) = (n_1, n_2, \ldots, n_K) = \bar{n}$ with $n_1 + n_2 + \cdots + n_K = n$, the $\{Y_{i,j},\ 1 \le j \le n_i\}$ are independent random variables with distribution function $G_i$.
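The two-stage drawing scheme can be simulated directly. The sketch below is only an illustration: the subgroup probabilities and the lognormal income laws are hypothetical choices, not the models fitted in the paper.

```python
import random


def draw_sample(n, p, samplers, seed=0):
    """Draw n observations: first a subgroup label with probabilities p,
    then an income from that subgroup's own distribution."""
    rng = random.Random(seed)
    K = len(p)
    labels = rng.choices(range(K), weights=p, k=n)     # stage 1: subgroup
    incomes = [samplers[i](rng) for i in labels]       # stage 2: income
    counts = [labels.count(i) for i in range(K)]       # n_i^*, multinomial(n, p)
    return labels, incomes, counts


# Hypothetical subgroup probabilities and income distributions.
p = [0.5, 0.3, 0.2]
samplers = [
    lambda rng: rng.lognormvariate(0.0, 0.5),
    lambda rng: rng.lognormvariate(0.3, 0.5),
    lambda rng: rng.lognormvariate(0.6, 0.5),
]
labels, incomes, counts = draw_sample(1000, p, samplers)
print(counts, [c / 1000 for c in counts])  # empirical frequencies near p
```

Pooling the incomes regardless of the labels gives an independent sample from the mixture, while grouping them by label gives, conditionally on the counts, independent samples from each subgroup distribution.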

4. Our results 

The results stated here hold for a very large class of poverty measures summarized in the GPI. This is why we need the representation theorem of the GPI in [18]. In fact, we do not need here the complete form of [18], but a special case of it, based on the assumptions described below. For that, suppose that $G_i$ ($1 \le i \le K$) is the distribution function of the income for the $i$-th subgroup, and $G$ is the distribution function of the income for the global population. Let also $\gamma(x) = d\left(\frac{Z - x}{Z}\right) 1_{(x \le Z)}$ and $e(x) = 1_{(x \le Z)}$. The following assumptions are required.

(HD0) $G_0(Z) \in\ ]0, 1[$ for $G_0 \in \{G, G_1, \ldots, G_K\}$.

(HD1) There exist a function $h(p, q)$ of $(p, q) \in \mathbb{N}^2$ and a function $c(s, t)$ of $(s, t) \in (0, 1)^2$ such that, as $n \to +\infty$,

$$\max_{1 \le j \le Q} \left| A(n, Q)\, h^{-1}(n, Q)\, w(\mu_1 n + \mu_2 Q - \mu_3 j + \mu_4) - c(Q/n, j/n) \right| = o_P(n^{-1/2}).$$

(HD2) For the function $h$ found in (HD1), there exists a function $\pi(s, t)$ of $(s, t) \in \mathbb{R}^2$ such that, as $n \to +\infty$,

$$\max_{1 \le j \le Q} \left| w(j)\, h^{-1}(n, Q) - \frac{1}{n} \pi(Q/n, j/n) \right| = o_P(n^{-3/2}).$$

(HD3) The bivariate functions $c$ and $\pi$ have continuous partial differentials.

(HD4) For a fixed $x$, the functions $y \to \frac{\partial c}{\partial y}(x, y)$ and $y \to \frac{\partial \pi}{\partial y}(x, y)$ are monotone.

(HD5) $G_0$ is strictly increasing for any $G_0 \in \{G, G_1, \ldots, G_K\}$.

(HD6) We have, for any $G_0 \in \{G, G_1, \ldots, G_K\}$,

$$0 < H_c(G_0) = \int c(G_0(Z), G_0(y))\, \gamma(y)\, dG_0(y) < +\infty$$

and

$$0 < H_\pi(G_0) = \int \pi(G_0(Z), G_0(y))\, e(y)\, dG_0(y) < +\infty.$$
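We also need the following definitions, for $G_0 \in \{G, G_1, \ldots, G_K\}$:

$$J(G_0) = H_c(G_0)/H_\pi(G_0),$$

$$g_0(\cdot) = H_\pi^{-1}(G_0)\, g_{c,0}(\cdot) - H_c(G_0) H_\pi^{-2}(G_0)\, g_{\pi,0}(\cdot) + K(G_0)\, e(\cdot), \tag{4.1}$$

with

$$g_{c,0}(\cdot) = c(G_0(Z), G_0(\cdot))\, \gamma(\cdot), \qquad g_{\pi,0}(\cdot) = \pi(G_0(Z), G_0(\cdot))\, e(\cdot), \tag{4.2}$$

$$K(G_0) = H_\pi^{-1}(G_0) K_c(G_0) - H_c(G_0) H_\pi^{-2}(G_0) K_\pi(G_0), \tag{4.3}$$

with

$$K_c(G_0) = \int_0^1 \frac{\partial c}{\partial x}(G_0(Z), s)\, \gamma(G_0^{-1}(s))\, ds, \qquad K_\pi(G_0) = \int_0^1 \frac{\partial \pi}{\partial x}(G_0(Z), s)\, e(G_0^{-1}(s))\, ds, \tag{4.4}$$

$$\nu_0(\cdot) = H_\pi^{-1}(G_0)\, \nu_{c,0}(\cdot) - H_c(G_0) H_\pi^{-2}(G_0)\, \nu_{\pi,0}(\cdot), \tag{4.5}$$

where

$$\nu_{c,0}(\cdot) = \frac{\partial c}{\partial y}(G_0(Z), G_0(\cdot))\, \gamma(\cdot), \qquad \nu_{\pi,0}(\cdot) = \frac{\partial \pi}{\partial y}(G_0(Z), G_0(\cdot))\, e(\cdot),$$

with the conventions that, for $G_0 = G$, we denote $g_0 = g$ and $\nu_0 = \nu$. For $G_0 = G_i$, $1 \le i \le K$, we put $g_0 = g_i$ and $\nu_0 = \nu_i$. Finally, define

$$\ell_i(t) = (g - g_i)(G_i^{-1}(t)), \qquad c_i(t) = (p_i \nu - \nu_i)(G_i^{-1}(t)), \qquad 0 \le t \le 1. \tag{4.6}$$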

We are now able to briefly describe the approximation of [18]: if $G_0$ fulfills (HD1)-(HD6), then, as $n \to +\infty$, we have

$$\sqrt{n}\,(J_n(G_0) - J(G_0)) = \alpha_n(g_0) + \beta_n(\nu_0) + o_P(1),$$

where

$$\alpha_n(g_0) = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} \left\{ g_0(V_j) - \mathbb{E}\, g_0(V_j) \right\}$$

is the functional empirical process and

$$\beta_n(\nu_0) = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} \left\{ G_n(V_j) - G_0(V_j) \right\} \nu_0(V_j) \tag{4.7}$$

is a residual stochastic process introduced in [18] and widely studied in [12], where $G_n$ is the empirical distribution function associated with $\{V_1, \ldots, V_n\}$ sampled from $G_0$.

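Finally, we introduce the constants on which the variances in our theorem are based:

$$A_1 = \sum_{i=1}^{K} p_i \left\{ \int_0^{G_i(Z)} (\bar{g} - \bar{g}_i)^2 (G_i^{-1}(t))\, dt - \left( \int_0^{G_i(Z)} (\bar{g} - \bar{g}_i)(G_i^{-1}(t))\, dt \right)^2 \right\},$$

$$A_2 = \sum_{i=1}^{K} p_i \int_0^{G_i(Z)} \int_0^{G_i(Z)} (s \wedge t - st)\, (p_i \bar{\nu} - \bar{\nu}_i)(G_i^{-1}(s))\, (p_i \bar{\nu} - \bar{\nu}_i)(G_i^{-1}(t))\, ds\, dt,$$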

K K 



f Gi(Z) f Gi(Z) 

Pi / [G h (G-\s))AG h (G-\t)) 

Jo Jo 



i=l h=£i 

-G h {G-\s))G h {G-\t))] v{G-\s))v(G-\t))dsdt, 



K K K r Gi(Z) r Gi(Z) 

1/2 \ -> 1/2 - — ' 



JO 



G h (G-\s))AG h (G-\t)) 



=1 3+i h(£{i,j} 

-G h {G-\s))G h {G-\t))] v{GT l (s))v(GT\t))dsdt, 

JL fGi(Z) ( rsAGi(Z) 

Bl = S>/ W {g-g^G^it^dt 
-sJ^g-g^G-^t^dt^ipiV-V^G-Hs^ds, 



K K 



B * = Erf E^ / / ' M G t (Gj\t)) - sG^GjHt))} 
i=l j^i Jo Jo 



x{piV -Vi)(GT\8))v(Gr\t))d8&, 



and 



K K 



Gj(Z) ( r G l (G7 1 ( s ))AG l (Z) 



(M-9i){Gr\t))dt 



i=l j^i 



-GiiGj 1 ^)) x j\g-g t )(G7\t))dt}v(Gj l (s))ds, 



where



ESTIMATES OF GPI GAP OF DECOMPOSABILITY 

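$$g_0(\cdot) = \bar{g}_0(\cdot) \times e(\cdot) \quad \text{and} \quad \nu_0(\cdot) = \bar{\nu}_0(\cdot) \times e(\cdot),$$

$(g_0, \nu_0) \in \{g, g_1, \ldots, g_K\} \times \{\nu, \nu_1, \ldots, \nu_K\}$ and $i = 1, \ldots, K$.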



We are now able to state our main result. 



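Theorem 1. Let (HD0)-(HD6) hold. Then

$$gd_{n,0}^* = \sqrt{n}\,(gd_n - gd_{0,n}) \rightsquigarrow \mathcal{N}(0, \vartheta_1^2 + \vartheta_3^2) \quad \text{and} \quad gd_n^* = \sqrt{n}\,(gd_n - gd) \rightsquigarrow \mathcal{N}(0, \vartheta_1^2 + \vartheta_2^2),$$

where $gd_{0,n} = J(G) - \sum_{i=1}^{K} \frac{n_i^*}{n} J(G_i)$ is the intermediate centering coefficient used in the proofs, with

$$\vartheta_1^2 = A_1 + A_2 + A_3 + 2(B_1 + B_2 + B_3),$$

$$\vartheta_2^2 = \sum_{h=1}^{K} F_h^2\, p_h - \left( \sum_{h=1}^{K} F_h\, p_h \right)^2$$

for $F_h = \mathbb{E}\, g(Y^h) - J(G_h) + \sum_{i=1}^{K} p_i\, \mathbb{E}\, G_h(Y^i)\, \nu(Y^i)$, and

$$\vartheta_3^2 = \sum_{h=1}^{K} M_h^2\, p_h - \left( \sum_{h=1}^{K} M_h\, p_h \right)^2$$

for $M_h = \mathbb{E}\, g(Y^h) + \sum_{i=1}^{K} p_i\, \mathbb{E}\, G_h(Y^i)\, \nu(Y^i)$.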



Remark 1. This clearly makes the decomposability requirement less crucial, since the lack of decomposability may be estimated by confidence intervals based on this theorem, as we show in the next section.



5. Examples and Applications 

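5.1. Sen case. The conditions (HD1), (HD2), (HD3) and (HD4) hold for this measure and we have here $c(x, y) = x - y$ and $\pi(x, y) = y/x$. Further, when (HD0), (HD5) and (HD6) are true, the results of Theorem 1 apply with

$$J(G_0) = 2 \int_0^{G_0(Z)} \left( 1 - \frac{s}{G_0(Z)} \right) \left( \frac{Z - G_0^{-1}(s)}{Z} \right) ds,$$

$$K(G_0) = \frac{2}{G_0(Z)} \int_0^{G_0(Z)} \left( \frac{Z - G_0^{-1}(s)}{Z} \right) ds + \frac{J(G_0)}{G_0(Z)},$$

$$g_0(y) = \left\{ 2 \left( 1 - \frac{G_0(y)}{G_0(Z)} \right) \left( \frac{Z - y}{Z} \right) - \frac{2 J(G_0)}{G_0(Z)} \frac{G_0(y)}{G_0(Z)} + K(G_0) \right\} 1_{(y \le Z)}$$

and

$$\nu_0(y) = -\frac{2}{G_0(Z)} \left\{ \frac{Z - y}{Z} + \frac{J(G_0)}{G_0(Z)} \right\} 1_{(y \le Z)}.$$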
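As a sanity check on the limiting Sen index, the integral can be evaluated numerically for a distribution whose quantile function is known. The sketch below assumes the form J(G0) = 2 * integral over [0, G0(Z)] of (1 - s/G0(Z)) (Z - G0^{-1}(s))/Z ds, and uses the Uniform(0, 1) law with a hypothetical poverty line, for which the integral equals 2Z/3 in closed form.

```python
def exact_sen(G, G_inv, z, steps=10000):
    """Midpoint-rule approximation of the limiting Sen index J(G)."""
    gz = G(z)                 # proportion of poor, G(Z)
    h = gz / steps            # width of each sub-interval of [0, G(Z)]
    total = 0.0
    for m in range(steps):
        s = (m + 0.5) * h     # midpoint of the m-th sub-interval
        total += (1.0 - s / gz) * (z - G_inv(s)) / z
    return 2.0 * total * h


# Uniform(0, 1) incomes: G(y) = y and G^{-1}(s) = s on [0, 1].
z = 0.5  # hypothetical poverty line
value = exact_sen(lambda y: y, lambda s: s, z)
print(value, 2.0 * z / 3.0)  # numerical value versus the closed form 2Z/3
```

The same routine accepts any pair (G, G_inv), for example a lognormal distribution function and its quantile function.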



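5.2. Shorrocks case. We have the same conclusion as in the previous case with $c(x, y) = 2(1 - y)$, $K(G_0) = 0$,

$$J(G_0) = 2 \int_0^{G_0(Z)} (1 - s) \left( \frac{Z - G_0^{-1}(s)}{Z} \right) ds, \tag{5.1}$$

$$g_0(y) = 2\,(1 - G_0(y)) \left( \frac{Z - y}{Z} \right) 1_{(y \le Z)}$$

and

$$\nu_0(y) = -2 \left( \frac{Z - y}{Z} \right) 1_{(y \le Z)}.$$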



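5.3. Kakwani case. We also have the same conclusion for the Kakwani measure of parameter $k \ge 1$ with $c(x, y) = (x - y)^k$ and $\pi(x, y) = y^k/x$,

$$J(G_0) = (k + 1) \int_0^{G_0(Z)} \left( 1 - \frac{s}{G_0(Z)} \right)^{k} \left( \frac{Z - G_0^{-1}(s)}{Z} \right) ds,$$

$$K(G_0) = \frac{k(k + 1)}{G_0(Z)} \int_0^{G_0(Z)} \left( 1 - \frac{s}{G_0(Z)} \right)^{k-1} \left( \frac{Z - G_0^{-1}(s)}{Z} \right) ds + \frac{J(G_0)}{G_0(Z)},$$

$$g_0(y) = \left\{ (k + 1) \left[ \left( 1 - \frac{G_0(y)}{G_0(Z)} \right)^{k} \left( \frac{Z - y}{Z} \right) - \frac{J(G_0)}{G_0(Z)} \left( \frac{G_0(y)}{G_0(Z)} \right)^{k} \right] + K(G_0) \right\} 1_{(y \le Z)}$$

and

$$\nu_0(y) = -\frac{k(k + 1)}{G_0(Z)} \left[ \left( 1 - \frac{G_0(y)}{G_0(Z)} \right)^{k-1} \left( \frac{Z - y}{Z} \right) + \frac{J(G_0)}{G_0(Z)} \left( \frac{G_0(y)}{G_0(Z)} \right)^{k-1} \right] 1_{(y \le Z)}.$$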
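For reference, the empirical Kakwani measure whose limit is described here can be sketched as follows. This assumes the standard form of the index (weights (Q - j + 1)^k on the ranked poor, normalized by Q / (n * sum of j^k)); the sample and the poverty line are hypothetical.

```python
def kakwani(sample, z, k):
    """Empirical Kakwani index of parameter k; k = 1 gives the Sen index."""
    poor = sorted(y for y in sample if y <= z)
    q, n = len(poor), len(sample)
    if q == 0:
        return 0.0  # no poor individual in the sample
    phi = sum(j ** k for j in range(1, q + 1))  # normalizing sum of j**k
    total = sum((q - j + 1) ** k * (z - y) / z
                for j, y in enumerate(poor, start=1))
    return q / (n * phi) * total


incomes = [1.0, 3.0, 5.0]  # hypothetical sample
for k in (1, 2, 3):
    print(k, kakwani(incomes, 4.0, k))
```

Larger values of k put relatively more weight on the poorest individuals, which is the role of the exponent in c(x, y) = (x - y)^k.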



5.4. Data-driven applications. In this note, let us focus on the Sen case, which is trickier than the Shorrocks one. We consider the Senegalese database ESAM 1 of 1996, which includes 3278 households. We first consider the geographical decomposition into areas (Dakar being the capital). We have the Sen measure values for the whole of Senegal and for its ten sub-areas.

Area           Sen index   Size
Senegal        34.71%      3278
Kolda          51.66%       198
Dakar          22.73%      1122
Diourbel       40.16%       231
Saint-Louis    37.51%       314
Louga          34.53%       174
Tambacounda    47.47%       126
Kaolack        37.91%       316
Thies          41.31%       401
Fatick         42.22%       180
Ziguinchor     39.13%       216



Let us compute the different variances $\vartheta_1^2$, $\vartheta_2^2$ and $\vartheta_3^2$ of Theorem 1 with the empirical estimates $p_i \approx n_i/n$. We obtain for the geographical decomposability in Senegal: $\vartheta_1^2 + \vartheta_2^2 = 0.093195$, $\vartheta_1^2 + \vartheta_3^2 = 0.093224$ and $gd_n = 1.25450 \times 10^{-3}$. This gives the 95%-confidence interval

$gd \in [-0.00919\%,\ 0.00117\%]$,

that is,

$J(G) \in [34.7\%,\ 34.71\%]$.

We note the very accurate estimation of the Sen index for the whole country of Senegal, which leads us to say that this index is practically decomposable in this empirical case. We have already explained that decomposability does not matter when the income distribution is homogeneous over the population. It happens that earlier works show that the Senegalese data are well fitted by the lognormal or the Singh-Maddala model for each area, with very similar parameters.
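As a rough cross-check of the order of magnitude of the empirical gap, one can recombine the sub-area Sen indices of the Senegalese geographical table with weights n_i/n and compare the result with the national value. The figures below are the rounded percentages and sizes of that table, so the outcome can only match the reported gap approximately; the variable names are illustrative.

```python
# (Sen index in percent, sample size) for each sub-area of the table.
areas = {
    "Kolda": (51.66, 198),
    "Dakar": (22.73, 1122),
    "Diourbel": (40.16, 231),
    "Saint-Louis": (37.51, 314),
    "Louga": (34.53, 174),
    "Tambacounda": (47.47, 126),
    "Kaolack": (37.91, 316),
    "Thies": (41.31, 401),
    "Fatick": (42.22, 180),
    "Ziguinchor": (39.13, 216),
}
national_index = 34.71  # percent, whole-country row of the table
n = sum(size for _, size in areas.values())

# Size-weighted recombination of the sub-area indices, in percent.
recombined = sum(index * size for index, size in areas.values()) / n

# Empirical gap of decomposability, converted from percent to a raw value.
gap = (national_index - recombined) / 100.0
print(n, round(recombined, 3), gap)
```

The gap obtained this way is of order 10^-3, in line with the value of the empirical gap reported for this decomposition.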
Now, for a decomposition with respect to the gender of the household head, we get the following Sen measure values.

Gender     Sen index   Size
Senegal    34.7%       3278
Male       35.27%      2559
Female     32.62%       919


We get here $\vartheta_1^2 + \vartheta_2^2 = 1.87$, $\vartheta_1^2 + \vartheta_3^2 = 1.78$, $gd_n = 1.496 \times 10^{-4}$ and this 95%-confidence interval:

$gd \in [-0.00437\%,\ 0.0016\%]$,

that is,

$J(G) \in [34.696\%,\ 34.704\%]$.

We reach the same conclusion: the gap of decomposability is very low.

We have for the Mauritanian data (EPCV 2004) the following geographical and gender decomposability estimates. For the whole country and its thirteen sub-areas, we have:

Area            Sen index   Size
Mauritania       7.5%       9360
Hodh Charghy     6.73%      1211
Hodh Gharby      7.59%       469
Guidimagha      10.89%       234
Adrar            5.5%        568
Nouadhibou       0.83%       585
Tagant          13.34%       490
Tiris Zemmour    2.78%       284
Assaba           6.49%       514
Brakna          11.57%      1190
Trarza           9.12%      1217
Inchiri          4.89%       205
Gorgol          12.43%       796
Nouakchott       3.49%      1597

For this decomposition, $\vartheta_1^2 + \vartheta_2^2 = 7.85 \times 10^{-2}$, $\vartheta_1^2 + \vartheta_3^2 = 7.85 \times 10^{-2}$ and $gd_n = 6.40 \times 10^{-4}$. This gives the 95%-confidence interval

$gd \in [-0.00503\%,\ 0.00631\%]$.

For a stratification with respect to the gender of the household head, we have:

Gender       Sen index   Size
Mauritania   7.5%        9360
Male         7.46%       7513
Female       7.64%       1847


Here $\vartheta_1^2 + \vartheta_2^2 = 5.58 \times 10^{-2}$, $\vartheta_1^2 + \vartheta_3^2 = 5.59 \times 10^{-2}$, $gd_n = 3.99 \times 10^{-5}$, and the 95%-confidence interval is

$gd \in [-0.00474\%,\ 0.00482\%]$.

Our general conclusion is that, for all these cases, the Sen measure is almost decomposable. But this does not really matter. The important result is that we are able to obtain an accurate estimate of the gap of decomposability.
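The confidence intervals of this section follow the usual normal approximation suggested by Theorem 1: the empirical gap plus or minus a normal quantile times the square root of the asymptotic variance divided by n. The sketch below is an illustration under that assumption; the function name is hypothetical, and the inputs in the example are the values quoted above for the Mauritanian geographical decomposition.

```python
from math import sqrt
from statistics import NormalDist


def gap_confidence_interval(gd_n, variance, n, level=0.95):
    """Normal-approximation interval for the gap of decomposability.

    gd_n     : empirical gap of decomposability
    variance : estimate of the asymptotic variance of sqrt(n) * (gd_n - gd)
    n        : total sample size
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided normal quantile
    half_width = z * sqrt(variance / n)
    return gd_n - half_width, gd_n + half_width


# Inputs quoted for the Mauritanian geographical decomposition.
low, high = gap_confidence_interval(gd_n=6.40e-4, variance=7.85e-2, n=9360)
print(low, high)  # bounds in raw units (not multiplied by 100)
```

Since the interval shrinks at rate 1/sqrt(n), the large survey sizes used here are what make the gap estimates so tight.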

6. Proofs 

To begin, we need more notation to describe the representation result of [18] in a way appropriate to our proof. Let $G_0 \in \{G, G_1, \ldots, G_K\}$ and let $\{V_1, \ldots, V_m\}$ be a sample of incomes from $G_0$. Let $\alpha_{G_0,m}$ be the uniform empirical functional process based on

$$\{G_0(V_1), \ldots, G_0(V_m)\},$$

defined by

$$\alpha_{G_0,m}(g_0) = \frac{1}{\sqrt{m}} \sum_{j=1}^{m} \left\{ g_0(V_j) - \mathbb{E}\, g_0(V_j) \right\},$$

and define another empirical process, called here the residual empirical process,

$$\beta_{G_0,m}(\nu_0) = \frac{1}{\sqrt{m}} \sum_{j=1}^{m} \left\{ G_{G_0,m}(V_j) - G_0(V_j) \right\} \nu_0(V_j), \tag{6.1}$$

where $G_{G_0,m}$ is the empirical distribution function associated with $\{V_1, \ldots, V_m\}$. The representation theorem of Sall and Lo [18] establishes, under the hypotheses (HD0)-(HD6), for $J(G_0) = H_c(G_0)/H_\pi(G_0)$,

$$\sqrt{m}\,(J_m(G_0) - J(G_0)) = \alpha_{G_0,m}(g_0) + \beta_{G_0,m}(\nu_0) + o_P(1)$$

as $m \to \infty$, where $g_0$ and $\nu_0$ are described in (4.1) and (4.5).

Before going any further, we should specify the notation for the global population and the subgroups. For $G = G_0$, we drop the subscript $G_0$, so that $\alpha_n$, $\beta_n$, $G_n$, $J_n$ are respectively the empirical process, the residual empirical process (6.1), the empirical distribution function and the GPI based on the sample $Y_1, \ldots, Y_n$, and $J = J(G) = H_c(G)/H_\pi(G)$. Likewise, the functions $g_0$ and $\nu_0$ are denoted by $g$ and $\nu$ for $G = G_0$. For $G_0 = G_i$, $1 \le i \le K$, we use the subscript $i$, so that $\alpha_{i,n_i^*}$, $\beta_{i,n_i^*}$, $G_{i,n_i^*}$, $J_{i,n_i^*}$ will respectively denote the empirical process, the residual empirical process (6.1), the empirical distribution function and the GPI based on the sample $Y_{i,1}, \ldots, Y_{i,n_i^*}$, and $J_i(G_i) = H_c(G_i)/H_\pi(G_i)$, in accordance with the notation of Section 4; the functions $g_0$ and $\nu_0$ are denoted by $g_i$ and $\nu_i$ in this case. Sometimes, when the notation becomes too heavy, we lighten it. For example, we simply write $J_i(G_i) = J(G_i)$ and $J_{i,n_i^*}(G_i) = J_{n_i^*}(G_i)$, $i \in \{1, \ldots, K\}$.

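To begin the proof, we remark that $n^*(\omega_1) = (n_1^*(\omega_1), \ldots, n_K^*(\omega_1)) \to_P \{+\infty\}^K$ as $n = n_1^*(\omega_1) + \cdots + n_K^*(\omega_1) \to \infty$. We then get

$$\sqrt{n}\,(J_n(G) - J(G)) = \alpha_n(g) + \beta_n(\nu) + o_P(1) := \gamma_n + o_P(1) \tag{6.2}$$

and, for any $1 \le i \le K$,

$$\sqrt{n_i^*}\,(J_{n_i^*}(G_i) - J(G_i)) = \alpha_{i,n_i^*}(g_i) + \beta_{i,n_i^*}(\nu_i) + o_P(1) := \gamma_{i,n_i^*} + o_P(1). \tag{6.3}$$

Now we use the intermediate centering coefficient

$$gd_{0,n} = J(G) - \sum_{i=1}^{K} \frac{n_i^*}{n} J(G_i)$$

to get from (6.2) and (6.3)

$$\sqrt{n}\,(gd_n - gd_{0,n})(\omega_1, \omega_2) = \left\{ \gamma_n - \sum_{i=1}^{K} \left( \frac{n_i^*}{n} \right)^{1/2} \gamma_{i,n_i^*} \right\} (\omega_1, \omega_2) + o_P(1) \tag{6.4}$$

as $n \to \infty$. Then, we have

$$S_n^* = \gamma_n(g, \nu) - \sum_{i=1}^{K} \left( \frac{n_i^*}{n} \right)^{1/2} \gamma_{i,n_i^*}(g_i, \nu_i) = \alpha_n(g) - \sum_{i=1}^{K} \left( \frac{n_i^*}{n} \right)^{1/2} \alpha_{i,n_i^*}(g_i) + \beta_n(\nu) - \sum_{i=1}^{K} \left( \frac{n_i^*}{n} \right)^{1/2} \beta_{i,n_i^*}(\nu_i).$$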

Remark that 



a "(9) = 4 E " E 9(Y)) = ( \ jZ 9(Yj) - Eg(Y) J 

i=i i=i / 



with 



^ 1 ) = E^ i 7#VK%(n, 
i=l v n ^ 



and 



z?*(n,i) = x; !i ^%(nvpi 

This leads to 



/ 1 " ^ * \ ^ / * \ 1/2 

- E ( - M + 

7 = 1 ^ n ' 





Now, by denoting 

( n K * \ if . 1/2 

j=l i=l / i=l ^ ' 

one has 

K /„*\V2 



(6.5) 



C*(n, 1) = £ ( -j" ) £ [(5 - 9i) (Y id ) —E(g — 9i ) (Y*)] . 

i=i V n / V n i 7 =i 
we get



K / *\l/2 

(6.6) S* = C*(n, 1) + D*(n, 1) + /?» - E ( " ) 

.7 = 1 V n ^ 



Further one has 

A 



K / *\ . K n i 

7 = 1 ^ ' * i=l 1 = 1 



But 



and for i£R, 



a 



K nl 



= - E 1 «<«) = - E E 1 w J -<«) 



i=l 



i=l j=l 



Thus 



,n* , 1 v- 



K 



«=1 J=l 



i=l 



1 AT n- 

«^EE 

1=1 7 = 1 



a: 



h=l 



E(^)G^(^)-^G,(^) 



From this, we put and subtract Ylh=i(lt)GhO / ij) to nave 



A- « 



=1 J=l 

a: «; 



A 



A=l 

a: 



A 



E ? ^(^i-E ? ««») 



h=l 



+^EE 



E 

h=l 



n 



h 



Ph) G h (Yij) 



(6.8) 



K ni K 



v i=i j=i /i=i 



= -4 EEE ( ) ^(^0 - G fc (y -)] k^o 



A" rij 



+ tsE:e: 



i=i j=i 



A=l 
Now we put together (6.7) and (6.8), while separating the two cases $h = i$ and $h \ne i$ in (6.8), to get

K , *\l/2 



i=i 



^ / *\l/2 



1= f; {G i>n .(^-) - Gi(^)} f v " - "i) 



1 3=1 



K 



1/2 A" * -. n i 

+ E ( t ) E v^f E K(^) - k^o 



i=i 



A" "I 



1 tl 1 

+^EE 

x /n ^ ^ 



K 



h=l 



(6.9) 
with 

(6.10) 

o> 2 ) = E(v 



and 
(6.11) 

A / *\l/2 A" * 

C-(».3) = E(^) £ * 



i=l j=l 

:C*(n,2) + C*(n,3) + r>*(n,2), 



i=i 



£ {G,, n: - G^)} {^-v - v\ {Y l3 ) 
n i j=i V a / 



r i 

*E[ G ^( y ^-^(^)J K^O- 



/i^j V i j=l 



We arrive, by comparing (6.6) and (6.9), at

$$S_n^* = C^*(n, 1) + C^*(n, 2) + C^*(n, 3) + D^*(n, 1) + D^{**}(n, 2). \tag{6.12}$$

Let us have a look at



A' 



h=i 



n 



Ph 



A 



i=l 



By the weak law of large numbers 



n*\ 1 



i=i 



n l n 



A 



> . 



E ( 7T ) ^ E G ^>(^) J> ^wEG^n^) = 



3=1 



That is 



fl**(n,2) = |(^) ff ^ +0P (l). 
Finally

(6.13) 
Hence 



=: D*(n,2) + o P (l). 



gd* n = S* n + Vn(gd , n - gd). 



gd* n = C*(n,l) + C*(n,2) + C*(n,3) 



K 



+D*(n,l)+D*(n,2)-J2 



i=i 



n\ — npi 
\/nPi 



MGi)y/pl+Op(l), 



(6.14) 
with 
(6.15) 
and 



= :C*(n)+D*(n)+o P (l). 
C*(n) = C*(n, 1) + C*(n, 2) + C*(n, 3) 



K 



D*(n) = D*{n, 1) + D*(n, 2) - ^ 



i=i 



n\ - npi 



Ji{Gi)\pPi 



K 

£ 

i=i 



rij - npj 

\/Wi 



{Hi + Egiy 1 ) - Ji{Gi))y/p~i 



K 



1=1 



\/Wi 



We now have to prove that $gd_n^* = \sqrt{n}\,(gd_n - gd)$ weakly converges to a $\mathcal{N}(0, \vartheta_1^2 + \vartheta_2^2)$ random variable. For this it suffices, based on (6.14), to prove that $S_n^{**} = C^*(n) + D^*(n)$ converges to $\mathcal{N}(0, \vartheta_1^2 + \vartheta_2^2)$. Now put

$$N(K) = \{\bar{n} = (n_1, \ldots, n_K),\ n_i \ge 0,\ n_1 + \cdots + n_K = n\}.$$

Since $n^* = (n_1^*, \ldots, n_K^*) \to_P \{+\infty\}^K$, we find, for a fixed $\varepsilon > 0$, $K$ positive numbers $N_i$ ($1 \le i \le K$) such that, for $n_i \ge N_i$ ($1 \le i \le K$), which implies that $n \ge N = N_1 + \cdots + N_K$,

$$\mathbb{P}(\exists\,(1 \le i \le K),\ n_i^* \le N_i) \le \varepsilon.$$

Let

$$N(K, 1) = N(K) \cap \{\bar{n} = (n_1, \ldots, n_K),\ \exists\,(1 \le i \le K),\ n_i \le N_i\}$$

and $N(K, 2) = N(K) \setminus N(K, 1)$. We remark that, conditionally on $(n^* = \bar{n})$, $C^*(n)$ becomes $C(\bar{n})$, does not depend on $\omega_1$ and only includes the independent random variables $\{Y_{i,j},\ 1 \le j \le n_i,\ 1 \le i \le K\}$. From Lemma 1 below, we have

$$C(\bar{n}) \rightsquigarrow \mathcal{N}(0, \vartheta_1^2).$$
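Also, conditionally on $(n^* = \bar{n})$, $D^*(n)$ becomes a quantity that we denote by $D(\bar{n})$. Now, for $\mathrm{i}^2 = -1$,

$$\psi_{S_n^{**}}(t) = \mathbb{E}(\exp(\mathrm{i} t S_n^{**})) = \sum_{\bar{n} \in N(K)} \mathbb{P}(n^* = \bar{n})\, \mathbb{E}\left( \exp(\mathrm{i} t C^*(n) + \mathrm{i} t D^*(n)) \mid n^* = \bar{n} \right) = \sum_{\bar{n} \in N(K)} \mathbb{P}(n^* = \bar{n})\, \exp(\mathrm{i} t D(\bar{n}))\, \mathbb{E}\left( \exp(\mathrm{i} t C^*(n)) \mid n^* = \bar{n} \right).$$

Recall that, by the classical limiting law of the multinomial $K$-vector,

$$\left( \frac{n_i^* - n p_i}{\sqrt{n p_i}} \right)_{1 \le i \le K} \rightsquigarrow (Z_1, \ldots, Z_K)^t,$$

where $(Z_1, \ldots, Z_K)^t$ is a Gaussian vector with $Var(Z_i) = 1 - p_i$ and $Cov(Z_i, Z_j) = -\sqrt{p_i p_j}$ for $i \ne j$. Then

$$D^*(n) \rightsquigarrow \mathcal{N}(0, \vartheta_2^2)$$

with

$$\vartheta_2^2 = \sum_{h=1}^{K} F_h^2\, p_h (1 - p_h) - \sum_{1 \le h \ne k \le K} F_h F_k\, p_h p_k = \sum_{h=1}^{K} F_h^2\, p_h - \left( \sum_{h=1}^{K} F_h\, p_h \right)^2.$$

We remark that this is the variance of the function $h \to F_h$ of $h \in [1, K]$ with respect to the probability measure $\sum_{1 \le h \le K} p_h \delta_h$.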
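The algebraic identity behind this variance can be verified numerically. The sketch below assumes that the limiting variable is the linear combination of the Gaussian coordinates Z_h with coefficients F_h * sqrt(p_h), which is one reading of the derivation above; the values of F and p are hypothetical.

```python
def variance_from_covariance(F, p):
    """Variance of sum_h F_h * sqrt(p_h) * Z_h, where Var(Z_h) = 1 - p_h
    and Cov(Z_h, Z_k) = -sqrt(p_h * p_k) for h != k."""
    K = len(p)
    total = 0.0
    for h in range(K):
        for k in range(K):
            cov = (1.0 - p[h]) if h == k else -((p[h] * p[k]) ** 0.5)
            total += F[h] * (p[h] ** 0.5) * cov * F[k] * (p[k] ** 0.5)
    return total


def variance_closed_form(F, p):
    """Variance of h -> F_h under the probability measure sum_h p_h * delta_h."""
    mean = sum(f * q for f, q in zip(F, p))
    return sum(f * f * q for f, q in zip(F, p)) - mean ** 2


# Hypothetical coefficients and subgroup probabilities (summing to one).
F = [1.0, -2.0, 0.5]
p = [0.2, 0.3, 0.5]
print(variance_from_covariance(F, p), variance_closed_form(F, p))
```

Both functions return the same number, which is the variance of a discrete random variable taking the value F_h with probability p_h.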
Put now 

N(K, 1) = N(K) n{n = (m, ...n K ), 3(1 < i < if), n* < 7VJ 
and N{K, 2) = N(K)\N(if, 1). Then 

Y exp(htD(n))F(n* = n)E(exp(htC (n)))) = B(n, 1) + B(n, 2) 

nSN(K) 

with 



|B(n,l) 



(6.16) 
and 

(6.17) 



Y exp(htD(n))F(n* = n)E(exp(faC(n))) 
neN(_ft:,i) 

< P(3(l < z < K),n* < Ni) -> 0, 



B(n,2)- Y exp(-(iM) 2 /2)exp(/itL>(ra))P(n* = n) 

nGN(_ftT,2) 
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< e Yl P ( n * = m ) - £ - 

Finally, for 

(6.18) B*(n,2)= exp(-( 1 ? 1 t) 2 /2)exp(/it J D(n))P(n* =n), 
we are able to use (6.18) and to get 

lim sup B*(n,2)- ^ exp(htD(n))P(n* = n)E(exp(-(iM) 2 /2)) 

neN(K) 

= 0. 

But 

(6.19) Eexp(thD*(n))= ^ exp(htD* (n) / (n* = n))P{n* = n) 

n<m{K) 

= Y exp(htD(n))P{n* = n) exp(-(i? 2 t) 2 / 2 )) 

nSN(X) 

By putting together the previous formulas, and by letting e | 0, we arrive 
at 

^ r (£)^exp(-(tf 2 + #!)t 2 /2). 

This proves the asymptotic normality of dg^ of the theorem corresponding 
to S** . That of cig* corresponds to S*. This latter is achieved by omitting 

the term \/ra^£i(ir ~ Vi)Ji{Gi) in (6.13). This leads to obtained from 
Fh by dropping Ji(Gi). This completes the proofs. 

We now prove the lemma used in the proof.

Lemma 1. Let $C(\bar{n}) = C(\bar{n}, 1) + C(\bar{n}, 2) + C(\bar{n}, 3)$, where the $C(\bar{n}, i)$ are respectively defined in (6.5), (6.10) and (6.11) for $i = 1, 2, 3$. Then, as $n \to +\infty$,

$$C(\bar{n}) \rightsquigarrow \mathcal{N}(0, \vartheta_1^2).$$

Proof. Recall that

$$C(\bar{n}) = C(\bar{n}, 1) + C(\bar{n}, 2) + C(\bar{n}, 3). \tag{6.20}$$

Let, for each $i \in [1, K]$, $\mathbb{G}_{n_i}(i, \cdot)$ be the functional empirical process based on $\{G_i(Y_{i,j}),\ 1 \le j \le n_i\}$, $1 \le i \le K$. We consider the three terms in (6.20), that is, the $C(\bar{n}, i)$, $1 \le i \le 3$, defined in (6.5), (6.10) and (6.11), and prove that each of them converges to a random variable $C(i)$ depending on the limiting Gaussian processes $\mathbb{G}(i, \cdot)$ of $\mathbb{G}_{n_i}(i, \cdot)$. This is enough to prove the asymptotic normality. The variance $\vartheta_1^2$ will be nothing else but that of $C(1) + C(2) + C(3)$. Firstly, we treat $C(\bar{n}, 1)$. Remark that, conditionally on $(n^* = \bar{n})$, the random sequences $\{Y_{i,j},\ 1 \le j \le n_i,\ 1 \le i \le K\}$ are independent and only depend on $\omega_2 \in \Omega_2$. We have



Eg) 1/2 an ! (<7,) = ^ 

i=l 



K n K 

i=i j=i 



i=i 



, K rii K 

^EE^)-E© E Mn) 



i=i j=i 



i=i 



and 



( n K \ 

j=l i=l J 

' K rii K 



Then, by (6.5) and replacing n* by rij, i = 1, ...,-ftT, we get 



if 



C(n, 1) = a n (g, 1) f— J a ni (ft) 



A' 



n . 



> . 



( 6 - 21 ) = E © 1 2 1 J= E {(^ - - E - (n» 

i=i 

This implies that 
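We finally have that

$$C(\bar{n}, 1) \rightsquigarrow C(1) = \sum_{i=1}^{K} p_i^{1/2}\, \mathbb{G}(i, (g - g_i) G_i^{-1}).$$

Since the $\mathbb{G}(i, (g - g_i) G_i^{-1})$ are independent, centered and Gaussian, we get that

$$A_1 = \mathbb{E}\, C^2(1) = \sum_{i=1}^{K} p_i\, \mathbb{E}\, \mathbb{G}^2(i, (g - g_i) G_i^{-1}) = \sum_{i=1}^{K} p_i \left\{ \mathbb{E}(g - g_i)^2(Y^i) - \left( \mathbb{E}(g - g_i)(Y^i) \right)^2 \right\}.$$

In the sequel we take

$$g_0(x) = \bar{g}_0(x) \times e(x) \quad \text{and} \quad \nu_0(x) = \bar{\nu}_0(x) \times e(x),$$

and $(g_0, \nu_0) \in \{g, g_1, \ldots, g_K\} \times \{\nu, \nu_1, \ldots, \nu_K\}$ and $i = 1, \ldots, K$. Then we arrive at

$$A_1 = \sum_{i=1}^{K} p_i \left\{ \int_0^{G_i(Z)} (\bar{g} - \bar{g}_i)^2 (G_i^{-1}(t))\, dt - \left( \int_0^{G_i(Z)} (\bar{g} - \bar{g}_i)(G_i^{-1}(t))\, dt \right)^2 \right\}.$$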

Secondly, one has 



K 1 _ 2 n» ) 

^ 2 ) = E © 7^ E {<Wr«) - Gi(y«)} © - ^ 

i=l [v i j=1 J 



We have 



and thus 



)= E - © - (^) 

= / -£ni(i, s)(Pif - ^)( G ' J rl ( s )) ds + op(l) 
Jo 

= [ G ni (i,s)(p^-^)(Gr 1 (s))ds + o P (l) 

JO 

-> / G(i,s)(p i i/-i/ i )(Gr 1 ( s ))da, 

JO 



(6.22) C(n,2) -> C(2) = W /2 C G(i,s)(piV — v i ){G~ 1 {s))ds. 

i=i ^ 

Finally, one has 

^ 3 ) = E © 1/2 E E [Gn fc (y«) - G fc (y«)i "(TV)- 

i=l fc^i v * i=l 

But, for each fixed i G {1, ..,-K"}, 

-. rii 

^ G n h (^j)-G h {Y, l3 )]v{Y l3 



3=1 



W l {G nh {GT\V n Xi,s)))-G h (GT\V n Si,s)))}^{GT\V n ^s)))d 



s. 



We recall that $\nu$ is of the form

$$\nu(y) = \nu_a(y)\, 1_{(y \le Z)},$$

where $\nu_a$ is continuous on compact sets $[0, L]$, $L > 0$. Since, as $n \to \infty$,

$$\sup_{s \in (0,1)} |V_{n_i}(i, s) - s| \to 0 \quad \text{a.s.},$$

we see that, for large values of $n$, these integrals are performed at most on some interval $[0, G_i(Z) + \varepsilon]$, which includes those $s$ satisfying $V_{n_i}(i, s) \le G_i(Z)$. By the assumptions, the functions $\nu_a$ and $G$ are continuous on such compact sets. Thus




n 
n h 

re- * 
nh Jo 




Next 

're,- 



with 




nh Jo 



Y,[ G n h (^j)-G h {Y l3 )]v(Y l3 

1 3=1 

f & Knh {h,G h {G-\V ni (i,s))) x v(G-\V ni (i,s)))ds 
Jo 

G Knh (h,G h {G-\V ni {i,s))) x v{GT l (s))ds + o P (l). 

f 1 G nh (h, G h (G-\s)) x v(G-\s))ds + R n + o P (l), 
Jo 



Rn= f 1 {G h , lh (h,G h (Gr\v nt (i,s))) -G Knh {KG h {G~\s))}xv(Gr\s))ds 
Jo 

and 

rGi{Z)+e 

\Rn\ < / |G M ^,G,(Gri( Ki ^ 
Jo 

We surely have, by continuity of $G_h$ on $(0, G_i^{-1}(G_i(Z) + \varepsilon))$,

$$
\sup_{s \le G_i(Z) + \varepsilon} \big| G_h(G_i^{-1}(V_{n_i}(i,s))) - G_h(G_i^{-1}(s)) \big| = a_n \to 0.
$$

We obtain here a modulus of continuity of the uniform empirical process (see
Shorack and Wellner [20], page 531) and then

$$
\sup_{s \le G_i(Z) + \varepsilon} \big| G_{n_h}(h, G_h(G_i^{-1}(V_{n_i}(i,s)))) - G_{n_h}(h, G_h(G_i^{-1}(s))) \big| = O\big(\sqrt{-a_n \log a_n}\big).
$$



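We finally get

$$
R_n = O\big(\sqrt{-a_n \log a_n}\big) \int \nu(G_i^{-1}(s))\,ds \to 0
$$

and we arrive at

$$
C(n,3) \rightarrow C(3) = \sum_{i=1}^{K} p_i^{1/2} \sum_{h \ne i} p_h \int_0^1 G(h, G_h(G_i^{-1}(s))) \times \nu(G_i^{-1}(s))\,ds. \tag{6.23}
$$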

We are now going to compute the variance based on the independent
functional Brownian bridges $G(i, \cdot)$, which are the limits of the functional
empirical processes $G_{n_i}(i, \cdot)$ respectively associated with
$\{G_i(Y_{ij}), 1 \le j \le n_i\}$, $i = 1, \dots, K$. Straightforward calculations
give what follows. First,

$$
A_1 = E\,C^2(1) = \sum_{i=1}^{K} p_i\, E\,G^2\big(i, (g - g_i) G_i^{-1}\big).
$$

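We denote $\ell_i = (g - g_i) G_i^{-1}$ in the sequel for the sake of simplicity. Next, for

$$
C(2) = \sum_{i=1}^{K} p_i^{1/2} \int_0^1 G(i,s)\,(p_i\nu - \nu_i)(G_i^{-1}(s))\,ds
$$

we have

$$
A_2 = E(C^2(2)) = \sum_{i=1}^{K} p_i \int_0^{G_i(Z)} \int_0^{G_i(Z)} (s \wedge t - st)\, c_i(t)\, c_i(s)\,ds\,dt,
$$

where $c_i(s) = (p_i\nu - \nu_i)(G_i^{-1}(s))$. Now for

$$
C(3) = \sum_{i=1}^{K} p_i^{1/2} \sum_{h \ne i} p_h \int_0^1 G(h, G_h(G_i^{-1}(s))) \times \nu(G_i^{-1}(s))\,ds,
$$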

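we have

$$
A_3 = E(C^2(3)) = E\bigg( \sum_{i=1}^{K} p_i \Big( \sum_{h \ne i} K_{i,h} \Big)^2
 + \sum_{i=1}^{K} \sum_{j \ne i} (p_i p_j)^{1/2} \Big( \sum_{h \ne i} K_{i,h} \Big) \Big( \sum_{h' \ne j} K_{j,h'} \Big) \bigg).
$$

Here we put

$$
K_{i,h} = p_h \int_0^1 G(h, G_h(G_i^{-1}(s))) \times \nu(G_i^{-1}(s))\,ds,
$$

and split $A_3$ into

$$
A_{31} = E\bigg( \sum_{i=1}^{K} p_i \Big( \sum_{h \ne i} K_{i,h} \Big)^2 \bigg)
$$

and

$$
A_{32} = E\bigg( \sum_{i=1}^{K} \sum_{j \ne i} (p_i p_j)^{1/2} \Big( \sum_{h \ne i} K_{i,h} \Big) \Big( \sum_{h' \ne j} K_{j,h'} \Big) \bigg).
$$

Now, by using the independence of the centered stochastic processes $G(h, \cdot)$
for different values of $h \in \{1, \dots, K\}$, one gets

$$
A_{31} = \sum_{i=1}^{K} p_i \sum_{h \ne i} E\,K_{i,h}^2
$$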



and then

$$
A_{31} = \sum_{i=1}^{K} p_i \sum_{h \ne i} p_h^2 \int_0^{G_i(Z)} \int_0^{G_i(Z)}
 \big[ G_h(G_i^{-1}(s)) \wedge G_h(G_i^{-1}(t)) - G_h(G_i^{-1}(s))\, G_h(G_i^{-1}(t)) \big]\,
 \nu(G_i^{-1}(s))\, \nu(G_i^{-1}(t))\,ds\,dt.
$$
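The passage from the independence of the bridges to the double integral can be spelled out as follows. This is only a sketch, assuming that $K_{i,h}$ carries the weight $p_h$ as in its definition above and that $G(h, \cdot)$, indexed by points of $[0,1]$, is a standard Brownian bridge.

```latex
% Covariance structure of independent Brownian bridges, for u, v in [0, 1]:
\mathbb{E}\, G(h,u)\, G(h,v) = u \wedge v - uv ,
\qquad
\mathbb{E}\, G(h,u)\, G(h',v) = 0 \quad (h \neq h').
% With u = G_h(G_i^{-1}(s)) and v = G_h(G_i^{-1}(t)), Fubini's theorem gives
\mathbb{E}\, K_{i,h}^2
  = p_h^2 \int_0^1\!\!\int_0^1 \{ u \wedge v - uv \}\,
    \nu(G_i^{-1}(s))\, \nu(G_i^{-1}(t))\, ds\, dt ,
\qquad
\mathbb{E}\Big( \sum_{h \neq i} K_{i,h} \Big)^2 = \sum_{h \neq i} \mathbb{E}\, K_{i,h}^2 .
```

Since $\nu$ vanishes above the poverty line $Z$, the integration over $[0,1]^2$ reduces to $[0, G_i(Z)]^2$.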



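Next, one has

$$
A_{32} = E \sum_{i=1}^{K} \sum_{j \ne i} (p_i p_j)^{1/2} \sum_{h \ne i,\, h' \ne j} p_h\, p_{h'} \int_0^1 \int_0^1
 G(h, G_h(G_i^{-1}(s)))\, G(h', G_{h'}(G_j^{-1}(t)))\, \nu(G_i^{-1}(s))\, \nu(G_j^{-1}(t))\,dt\,ds
$$

$$
= \sum_{i=1}^{K} p_i^{1/2} \sum_{j \ne i} p_j^{1/2} \sum_{h \notin \{i,j\}} p_h^2 \int_0^1 \int_0^1
 \big[ G_h(G_i^{-1}(s)) \wedge G_h(G_j^{-1}(t)) - G_h(G_i^{-1}(s))\, G_h(G_j^{-1}(t)) \big]\,
 \nu(G_i^{-1}(s))\, \nu(G_j^{-1}(t))\,ds\,dt.
$$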



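Now we have

$$
C(1)C(2) = \bigg( \sum_{i=1}^{K} p_i^{1/2}\, G\big(i, (g - g_i) G_i^{-1}\big) \bigg)
 \bigg( \sum_{j=1}^{K} p_j^{1/2} \int_0^1 G(j,s)\, c_j(s)\,ds \bigg)
 = \sum_{i=1}^{K} \sum_{j=1}^{K} (p_i p_j)^{1/2} \int_0^1 G(j,s)\, c_j(s)\, G\big(i, (g - g_i) G_i^{-1}\big)\,ds
$$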

and get 

Si 



(g - g t )(y)dGi(y) - sE(g - 9l )(Y l ) \ Ci (s)ds 



= EG(1)G(2) = Vpi / E(G(M)G(j,4) c^ck 
i=i ^ 

i=i 7o U < 

/-Gi(Z) f psAGi(Z) 

= E^/ o w (s-^xg- 1 ^ 



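We have next

$$
C(2)C(3) = \bigg( \sum_{i=1}^{K} p_i^{1/2} \int_0^1 G(i,s)\, c_i(s)\,ds \bigg)
 \bigg( \sum_{j=1}^{K} p_j^{1/2} \sum_{h \ne j} p_h \int_0^1 G(h, G_h(G_j^{-1}(t))) \times \nu(G_j^{-1}(t))\,dt \bigg)
$$

$$
= \sum_{i=1}^{K} \sum_{j=1}^{K} (p_i p_j)^{1/2} \sum_{h \ne j} p_h \int_0^1 \int_0^1
 G(i,s)\, G(h, G_h(G_j^{-1}(t)))\, c_i(s)\, \nu(G_j^{-1}(t))\,ds\,dt.
$$

It follows that

$$
B_2 = E\,C(2)C(3) = \sum_{i=1}^{K} p_i^{3/2} \sum_{j \ne i} p_j^{1/2} \int_0^{G_i(Z)} \int_0^{G_j(Z)}
 \big[ s \wedge G_i(G_j^{-1}(t)) - s\, G_i(G_j^{-1}(t)) \big] \times (p_i\nu - \nu_i)(G_i^{-1}(s))\, \nu(G_j^{-1}(t))\,ds\,dt.
$$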
Now finally for 

G(1)G(3)= (x>} /2 G(i,4)) 

K K J 

EpJ Ef* / G(/ t ,G,(G- 1 ( S ))x l ,(G- 1 ( S ))^ 

i=l ft^i - 70 

= EErf /2 ^ /2 E^ / ^G^Gj^s^G^l,) x u(Gj\s))ds, 
i=i jfr h+i Jo 

where the i\s are defined in 4.6, we have 

£? 3 = EG(1)G(3) 

K i 

= J^jhpj / E £i)G(i, Gi(G7 1 (s))| x u(Gj\s))ds 

i=l 7 ° l> 

-G i (G7 1 ( S ))^ 1 (5-5i)(G- 1 (t))^}^(G7 1 ( S ))d S . 

We have now finished the variance computation, that is 
$ 2 1 = A 1 +A 2 + A 3 + 2{B 1 + B 2 + B 3 ) 
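Each term of this variance is a weighted sum of integrals against a Brownian bridge covariance, so it can be approximated numerically once the functions involved are estimated. As an illustration, the Python sketch below evaluates a term of the form of $A_2$, namely $\sum_i p_i \int_0^{u_i}\int_0^{u_i} (s \wedge t - st)\, c_i(s)\, c_i(t)\,ds\,dt$ with $u_i = G_i(Z)$, by the midpoint rule. The function names, the grid size and the toy inputs ($c_i \equiv 1$, two groups) are assumptions made for the illustration; in practice the $c_i$ and $G_i(Z)$ would have to be replaced by estimates from the data.

```python
import numpy as np

def a2_group_term(c, upper, n_grid=1000):
    """Midpoint-rule value of the double integral of (min(s,t) - s*t) c(s) c(t) over [0, upper]^2."""
    h = upper / n_grid
    s = (np.arange(n_grid) + 0.5) * h                   # midpoints of the grid cells
    kernel = np.minimum.outer(s, s) - np.outer(s, s)    # Brownian bridge covariance
    cs = np.asarray(c(s), dtype=float)
    return float(cs @ kernel @ cs) * h * h

def a2(p, c_funcs, uppers, n_grid=1000):
    """Weighted sum over groups: sum_i p_i * (double integral for group i)."""
    return sum(p_i * a2_group_term(c_i, u_i, n_grid)
               for p_i, c_i, u_i in zip(p, c_funcs, uppers))

def ones(s):
    # Hypothetical choice c_i = 1, for which the integral has a closed form.
    return np.ones_like(s)

check = a2_group_term(ones, 1.0)                 # closed form: 1/3 - 1/4 = 1/12
toy = a2([0.4, 0.6], [ones, ones], [0.5, 1.0])   # two toy groups
print(check, toy)
```

For $c \equiv 1$ on $[0, u]^2$ the integral equals $u^3/3 - u^4/4$, which gives a simple check of the quadrature before plugging in estimated functions.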
7. Conclusion

We have only illustrated how to apply our results to the Sen measure, using
the Senegalese database ESAM I and the Mauritanian EPCV 2004 data. It
would be more interesting and instructive to conduct large-scale data-driven
studies, for example on the West African databases, for several measures. It
would also be interesting to see the influence of the Kakwani parameter k on
the results. This study is underway.

8. Appendix 

We would like to provide indications to the reader for using the techniques
developed here. A zipped file is available at:

http://www.ufrsat.org/lerstad/sen-decomposabilite.rar

It includes the executable file sendecomp.exe, which performs the computation
of dg. Here is how to proceed:

(i) Download the zipped file and unzip it in a folder named, for instance,
sen-decomposabilite.

(ii) Copy into the sen-decomposabilite folder the following user files: the
income file dep.txt, of size n at most equal to 10000; the equivalent-adult
file eq.txt, of the same size n; and finally the labels file labels.txt,
including the names of the different strata. If the income file is
already scaled for individuals, use an eq.txt file of size n having unity
at each line. The number of labels is at most equal to 15. They must
be numbered from 1 to KK < 16.

(iii) Execute sendecomp.exe by clicking on it. The user is prompted to
provide the income file name, the equivalent-adult file name and the
labels file name, without the suffix .txt.

(iv) The package provides the values of the Sen measure for the different
strata and reports the value of the gap of decomposability.

(v) For the user's practice, we provide in the zipped folder the following
files: the income variable (depm.txt), the equivalent-adult variable
(eom.txt) and the labels (here, areas) file (regm.txt).

(vi) If the data size exceeds n = 10000 or the number of strata exceeds
KK = 15, the user is free to write to the authors and adapted
packages will be provided.

Finally, for those who want to build their own packages in some language, we
provide a Visual Basic module including the main program and the subroutines.
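For readers who prefer to experiment outside the executable, the computation described in the steps above can be sketched in Python. This is not the authors' Visual Basic module: it is a minimal sketch under stated assumptions, namely the classical Sen (1976) formula $\frac{2}{n(q+1)Z} \sum_{j=1}^{q} (q+1-j)(Z - Y_{(j)})$ with the poor defined by $Y \le Z$, one stratum label per observation, and the gap taken as the global measure minus the size-weighted average of the stratum measures. The function names, the poverty line and the toy data are hypothetical.

```python
from collections import defaultdict

def sen_index(incomes, z):
    """Sen (1976) index; assumption: an individual is poor when income <= z."""
    n = len(incomes)
    poor = sorted(y for y in incomes if y <= z)
    q = len(poor)
    if n == 0 or q == 0:
        return 0.0
    # The poorest individual (rank j = 1) receives the largest weight q.
    total = sum((q + 1 - j) * (z - y) for j, y in enumerate(poor, start=1))
    return 2.0 * total / (n * (q + 1) * z)

def gap_of_decomposability(incomes, labels, z):
    """Global Sen index minus the size-weighted mean of the stratum indices (assumed sign)."""
    groups = defaultdict(list)
    for y, lab in zip(incomes, labels):
        groups[lab].append(y)
    n = len(incomes)
    per_group = {lab: sen_index(ys, z) for lab, ys in groups.items()}
    weighted = sum(len(groups[lab]) * per_group[lab] for lab in groups) / n
    return sen_index(incomes, z) - weighted, per_group

# Toy data: incomes already expressed per adult equivalent, two strata.
incomes = [1.0, 2.0, 3.0, 4.0]
labels = [1, 2, 1, 2]
gap, per_group = gap_of_decomposability(incomes, labels, z=2.5)
print(per_group, gap)
```

When the income file is not yet scaled, each income would first be divided by the corresponding adult-equivalent value, which is the role played by eq.txt in the steps above.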